Learning Mixtures of Weighted Tree-Unions by Minimizing Description Length

Authors

  • Andrea Torsello
  • Edwin R. Hancock
Abstract

This paper focuses on how to perform the unsupervised clustering of tree structures in an information theoretic setting. We pose the problem of clustering as that of locating a set of archetypes that can be used to represent the variations in tree structure present in the training sample. The archetypes are tree-unions that are formed by merging sets of sample trees, and are attributed with probabilities that measure the node frequency or weight in the training sample. The approach is designed to operate when the correspondences between nodes are unknown and must be inferred as part of the learning process. We show how the tree merging process can be posed as the minimisation of an information theoretic minimum description length criterion. We illustrate the utility of the resulting algorithm on the problem of classifying 2D shapes using a shock graph representation.
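
As a rough illustration of the description length trade-off mentioned in the abstract, the sketch below scores candidate groupings of trees. It is not the authors' formulation. It assumes node correspondences are already known (each tree is reduced to a set of node labels), which the paper explicitly does not assume. The names `union_cost` and `description_length`, the independent Bernoulli model for node presence, and the half-log parameter cost are all hypothetical choices made for this example; the cost of encoding cluster membership is omitted.

```python
import math
from collections import Counter

def union_cost(trees, total):
    """Bits needed to encode one cluster of trees with a weighted union.

    Assumption: each tree is a set of node labels with known
    correspondences, and each union node is present independently
    with probability equal to its observed frequency in the cluster.
    """
    n = len(trees)
    counts = Counter(node for tree in trees for node in tree)
    data_bits = 0.0
    for c in counts.values():
        p = c / n  # node weight: frequency of the node in this cluster
        if 0.0 < p < 1.0:
            data_bits -= c * math.log2(p) + (n - c) * math.log2(1.0 - p)
    # Assumed model cost: half of log2(sample size) bits per node weight.
    model_bits = 0.5 * len(counts) * math.log2(total)
    return data_bits + model_bits

def description_length(clusters):
    """Total cost of a grouping: the sum of the per-union costs."""
    total = sum(len(cluster) for cluster in clusters)
    return sum(union_cost(cluster, total) for cluster in clusters)

# Toy sample: two similar trees and one unrelated tree.
a = frozenset({"root", "left", "right"})
b = frozenset({"root", "left", "right", "leaf"})
c = frozenset({"stem", "branch"})

separate = description_length([[a], [b], [c]])
merged_ab = description_length([[a, b], [c]])
merged_all = description_length([[a, b, c]])
```

On this toy sample, merging the two similar trees gives the shortest description, keeping every tree separate costs more, and forcing all three into one union costs the most. A greedy merging procedure would accept a merge only when it lowers the total in this way.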

Similar resources

Learning Mixtures of Tree-Unions by Minimizing Description Length

This paper focuses on how to perform the unsupervised learning of tree structures in an information theoretic setting. The approach is a purely structural one and is designed to work with representations where the correspondences between nodes are not given, but must be inferred from the structure. This is in contrast with other structural learning algorithms where the node-correspondences are ...

Zoning of Flood Hazard in Nowshahr City Using Machine Learning Models

The aim of this study is to predict and model flood hazard in the city of Nowshahr, Mazandaran province, using machine learning models. The criteria and indicators affecting flood hazard were identified from a review of sources; the indicators were then converted into rasters in the ArcGIS environment and finally standardized by a fuzzy method for use in the models. K-nearest neighbor ...

Comparison of Artificial Neural Network, Decision Tree and Bayesian Network Models in Regional Flood Frequency Analysis using L-moments and Maximum Likelihood Methods in Karkheh and Karun Watersheds

Proper flood discharge forecasting is significant for the design of hydraulic structures, reducing the risk of failure, and minimizing downstream environmental damage. The objective of this study was to investigate the application of machine learning methods in Regional Flood Frequency Analysis (RFFA). To achieve this goal, 18 physiographic, climatic, lithological, and land use parameters were ...

A Mixed Integer Programming Approach to Optimal Feeder Routing for Tree-Based Distribution System: A Case Study

A genetic algorithm is proposed to optimize a tree-structured power distribution network considering optimal cable sizing. For minimizing the total cost of the network, a mixed-integer programming model is presented determining the optimal sizes of cables with minimized location-allocation cost. For designing the distribution lines in a power network, the primary factors must be considered as m...

Fuzzy Programming for Parallel Machines Scheduling: Minimizing Weighted Tardiness/Earliness and Flow Time through Genetic Algorithm

Appropriate scheduling and sequencing of tasks on machines is one of the basic and significant problems that a shop or factory manager encounters, which is why extensive studies have been done on scheduling in recent decades. One type of scheduling problem is just-in-time (JIT) scheduling. Motivated by JIT manufacturing, this study investigates a mathematical model for...

Publication date: 2004